Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type instantiation count limiter #32079

Merged
merged 3 commits into from
Jul 3, 2019
Merged

Conversation

ahejlsberg
Copy link
Member

@ahejlsberg ahejlsberg commented Jun 25, 2019

This PR adds a type instantiation count limiter. When more than 5,000,000 type instantiations occur during the checking of a single expression or statement, we report an error similiar to the maximum instantiation depth limiter.

Fixes #31823.

@ahejlsberg
Copy link
Member Author

@typescript-bot run dt
@typescript-bot test this

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 25, 2019

Heya @ahejlsberg, I've started to run the extended test suite on this PR at 49e3f17. You can monitor the build here. It should now contribute to this PR's status checks.

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 25, 2019

Heya @ahejlsberg, I've started to run the parallelized Definitely Typed test suite on this PR at 49e3f17. You can monitor the build here. It should now contribute to this PR's status checks.

@ahejlsberg
Copy link
Member Author

Clean DT test run, RWC looks clean too (not sure what the failures are, but don't appear related to this).

@ahejlsberg
Copy link
Member Author

@weswigham @RyanCavanaugh I'm thinking this might be a better way to fix #31823. It errors on the code in #31823 after about one second but otherwise doesn't affect any of our other tests. It certainly seems to me that any expression or statement that causes >5,000,000 type instantiations (be they cached or not) is highly suspect.

@@ -11408,13 +11409,14 @@ namespace ts {
if (!type || !mapper || mapper === identityMapper) {
return type;
}
if (instantiationDepth === 50) {
if (instantiationDepth === 50 || instantiationCount >= 5000000) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need instantiationDepth if we're also looking at a total count?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely do. The depth limiter is there to catch infinitely recursive types, we'd blow the stack way before the count limiter gets to 5M.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Although we could always modify instantiation to use a trampoline instead of direct recursion so we don't have to worry about stack depth, and then the total count of instantiations would really be the absolute measure of complexity, without need for a specific depth-measuring mechanism. (And would enable people to nest those if/else style conditionals more than 50 levels deep, like we've occasionally seen reports of - those may be deep, but they're often trivial)

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 26, 2019

Sooooo.... What's the probability of me running into this limit? Does tsc currently tell you how many instantiations there are per type?

@weswigham
Copy link
Member

Does tsc currently tell you how many instantiations there are per type?

you can get the absolute count of types with --extendedDiagnostics

What's the probability of me running into this limit?

@typescript-bot pack this so you can try

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 26, 2019

Heya @weswigham, I've started to run the tarball bundle task on this PR at 49e3f17. You can monitor the build here. It should now contribute to this PR's status checks.

@typescript-bot
Copy link
Collaborator

Hey @weswigham, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/34499/artifacts?artifactName=tgz&fileId=A9DDBE9B8A640478E186DB462FF9CF077B58115484334C20715C5FBC262DEE9502&fileName=/typescript-3.6.0-insiders.20190626.tgz"
    }
}

and then running npm install.

@fatcerberus
Copy link

fatcerberus commented Jun 26, 2019

@AnyhowStep You can use --diagnostics (I think it is) to see stats for a build - I’ve never seen 5M instantiations for an entire project, let alone a single expression.

edit: I got ninja’d by like 20 minutes... that’ll teach me to refresh the page before posting...

@RyanCavanaugh
Copy link
Member

RyanCavanaugh commented Jun 26, 2019

What's the probability of me running into this limit?

The odds you'd hit this limit without either an adverserially-crafted repro or a program that would have generated an unbounded number of types are extremely low. A compilation generating 5M types wouldn't be one you would want to wait for to finish.

@ahejlsberg
Copy link
Member Author

ahejlsberg commented Jun 26, 2019

Note that the 5M limit is an absolute limit on requested type instantiations during the checking of a single expression or statement node, not a limit on unique instantiated types. For the counter to reach 5M a substantial portion of the types would have to be types that were previously instantiated and cached. The particular repro in #31823 does indeed request the same type instantiations over and over and then combines them into a relatively smaller number of new types. The core idea with the counter is to monitor work performed vs. resources consumed.

@ahejlsberg
Copy link
Member Author

Also note that --diagnostics and --extendedDiagnostics report the number of unique types created by a compilation, not the number of type instantiations requested. The latter is not a particularly useful metric for an entire compilation.

@AnyhowStep
Copy link
Contributor

Woohoo!
image

@weswigham
Copy link
Member

What connotation does "Woohoo!" have? lol

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 26, 2019

By the way, this is a composite project made up of 34 sub-projects.

It takes 9 minutes to compile from scratch.

However, most of these sub-projects take about 20-30s to compile.

The screenshot above is for one of those sub-projects that usually takes 20-30s to compile.
With 3.5.1, anyway.

I modified the package @weswigham packed for me to print out the highest instantiation count found per project. It looks like I do run into the 5 million limit!


[Edit]

I see a bunch of these projects having instantiationCount around 40k, and some 100k and 200k.


I'm actually rather surprised that particular sub-project runs into the 5 million limit.
All it does is map data from one format to another. (The object it maps is pretty deeply nested, though).

It usually takes 20-30s to build.


I suspected that a different sub-project would run into problems but it didn't.
That other project uses a SQL query builder that has a lot of complicated type checks.

It takes 70s to build at the moment, instantiates 100k at most.


[Edit]

I went "Woohoo!" because I was excited that I ran into the limit. Then the excitement wore off and I got a little sad. Then I realized that running into the limit is a bad thing because it means my projects will break =(


[Edit]

A compiler option to disable the limit would be nice =(


[Edit]

In case anyone is interested in part of the output from building,

https://pastebin.com/xWSMY9DL

I just found 768805 as the second highest instantiation count

@fatcerberus
Copy link

The limit implemented here is for 5M instantiation requests per expression/statement, hitting that regardless of project size seems... very surprising!

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 26, 2019

I removed the instantiationCount limit but kept the tracking and rebuilt,

New highest instantiation count 1
New highest instantiation count 5
New highest instantiation count 30
New highest instantiation count 33
New highest instantiation count 51
New highest instantiation count 98
New highest instantiation count 123
New highest instantiation count 198
New highest instantiation count 243
New highest instantiation count 353
New highest instantiation count 564
New highest instantiation count 8726
New highest instantiation count 22709
New highest instantiation count 65443
New highest instantiation count 142611
New highest instantiation count 143591
New highest instantiation count 351123
New highest instantiation count 2885368
New highest instantiation count 7746159
Files:                        677
Lines:                      19412
Nodes:                     118212
Identifiers:                37633
Symbols:                    93488
Types:                      33011
Memory used:              759852K
Assignability cache size:   19382
Identity cache size:          625
Subtype cache size:            13
I/O Read time:              0.01s
Parse time:                 0.00s
Program time:               0.06s
Bind time:                  0.01s
Check time:                 1.70s
transformTime time:         1.89s
Source Map time:            0.02s
commentTime time:           0.08s
printTime time:             1.81s
Emit time:                  1.81s
I/O Write time:             0.03s
Total time:                 3.58s

Notice that it checks in 1.70s and emits in 1.81s


I think I could quite easily make a repo that reproduces this, if necessary.


This is another sub-project. The compile-safe MySQL query builder sub-project.

New highest instantiation count 57
New highest instantiation count 652
New highest instantiation count 774
New highest instantiation count 3522
New highest instantiation count 6092
New highest instantiation count 15173
New highest instantiation count 45721
New highest instantiation count 47464
New highest instantiation count 51718
New highest instantiation count 52706
New highest instantiation count 113510
Files:                       2500
Lines:                      82515
Nodes:                     451677
Identifiers:               136747
Symbols:                   793105
Types:                     275633
Memory used:              638102K
Assignability cache size:  193968
Identity cache size:          227
Subtype cache size:         13240
I/O Read time:              0.03s
Parse time:                 0.16s
Program time:               0.49s
Bind time:                  0.11s
Check time:                11.84s
transformTime time:        78.96s
Source Map time:            0.10s
commentTime time:           0.66s
printTime time:            57.95s
Emit time:                 57.97s
I/O Write time:             0.19s
Total time:                70.41s

It takes way longer to check, 11.84s, but 113,510 is its highest instantiation count.

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 27, 2019

So, I just created a new monster query in the MySQL sub-project and built it,

New highest instantiation count 57
New highest instantiation count 652
New highest instantiation count 774
New highest instantiation count 3522
New highest instantiation count 6092
New highest instantiation count 15173
New highest instantiation count 45721
New highest instantiation count 47464
New highest instantiation count 51718
New highest instantiation count 52706
New highest instantiation count 14104531
New highest instantiation count 14912094
Files:                       2506
Lines:                      82774
Nodes:                     452834
Identifiers:               137142
Symbols:                   811196
Types:                     277451
Memory used:              698316K
Assignability cache size:  212155
Identity cache size:          227
Subtype cache size:         13241
I/O Read time:              0.10s
Parse time:                 0.55s
Program time:               1.50s
Bind time:                  0.37s
Check time:                18.78s
transformTime time:        74.84s
Source Map time:            0.14s
commentTime time:           0.69s
printTime time:            57.04s
Emit time:                 57.08s
I/O Write time:             1.07s
Total time:                77.72s

It adds 7 seconds to the check time (11+7 = 18).
It adds zero seconds to the emit time (57+0 = 57).
It adds 7 seconds to the total time (70+7 = 77).


It stays under the instantiationDepth of 50 but it exceeds the instantiationCount of 5,000,000 and goes up to 14,912,094

The generated sql string is 6k+ lines and 206,408 characters.

The query selects 50+ expressions, with sub-expressions nested pretty deeply and it is all type-checked.

So, I'm guessing the query is one of those "wide" types but also pretty "deep".

@ahejlsberg ahejlsberg marked this pull request as ready for review June 28, 2019 02:22
@ahejlsberg
Copy link
Member Author

@AnyhowStep I think the 5M instantiation count limit is quite reasonable. In fact, I feel it is pretty much an order of magnitude above reasonable. Any single expression or statement node that generates that much work for the type checker is highly suspect and beyond what we can promise to support with reasonable performance as the compiler evolves.

@ahejlsberg
Copy link
Member Author

@weswigham @RyanCavanaugh I think we should merge this.

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 28, 2019

Checking one expression with 7.7M instantiations and hundreds of other smaller expressions in 2s isn't a long time, though.

Even the 14M one checks in 7s, which isn't too bad.

Couldn't the limit be higher, if dead set on a limit?

I'd like to be able to trade away some performance for extra type safety. The expressions with 7M and 14M instantiations I have are a necessary part of the project that isn't at all like #31823

@falsandtru
Copy link
Contributor

The generated sql string is 6k+ lines and 206,408 characters.

This is too large. At least you should show your actual code, to consider its validness.

@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jun 28, 2019

It's for work. I can probably make a repo for the 7M expression, that part isn't too revealing.

The 14M expression, I'd rather only mail it to the TS team, if they want to look at it.

The 6k+ line SQL query is valid. It fulfills an actual business requirement for work. Unless you mean validity in another sense.

The TS code that generates the large SQL query is just a simple SELECT statement with dozens of expressions. But each expression is made of deeply nested sub-expressions and that's where the real "complexity" is

@ahejlsberg ahejlsberg merged commit 440ed83 into master Jul 3, 2019
@ahejlsberg ahejlsberg deleted the instantiationCountLimiter branch July 3, 2019 00:19
@AnyhowStep
Copy link
Contributor

AnyhowStep commented Jul 26, 2019

I just tried out a build containing this change.

I think tsserver isn't applying the 5M limit but tsc is.

Right now, I have a problem where VS code is inferring the return type of my 14M function just fine. No red squiggly lines or anything.

However tsc dies and says,

error TS2589: Type instantiation is excessively deep and possibly infinite.

To create a repro is going to take me a while because there's a lot of code involved but I'll open an issue when I get to it.

Just thought I'd post this comment and see if anyone here has any clue why this might happen


[Edit]

I think it might have to do with tsc not checking the dist folder now when building?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler hangs
7 participants